jbrockmendel
Member

tm.assert_series_equal(result, expected)

result = df.quantile([0.5])
expected = pd.DataFrame([], index=[], columns=[0.5])
Contributor

I don't quite understand this expected. Should index and columns be flipped (and in the implementation too)?

With DataFrame.quantile([0.5]) the

  • index is the q ([0.5])
  • columns are the (numeric) columns
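
A minimal sketch of the orientation described above, using a small hypothetical frame (the column names `A` and `B` and their values are illustrative and do not come from the PR):

```python
import pandas as pd

# Hypothetical numeric frame, used only to show the orientation of the result.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]})

# A list-like q returns a DataFrame with one row per requested quantile
# and one column per numeric column of the input.
result = df.quantile([0.5])

print(result.index.tolist())    # the requested quantiles
print(result.columns.tolist())  # the numeric columns of df
```

By that convention, a frame with no numeric columns would be expected to give a result whose index is still `[0.5]` and whose columns are empty, rather than the other way round.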

Member Author

I think you're right, yah. will update

Member Author

what do we do in the scalar-q case then?

Contributor

I think

In [11]: pd.Series(name=0.5)
Out[11]: Series([], Name: 0.5, dtype: float64)

right? So that it's equivalent to

In [18]: pd.DataFrame({"A": [1]}).quantile().drop("A")
Out[18]: Series([], Name: 0.5, dtype: float64)
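
The same equivalence can be written as a short script. This is only a sketch of the reasoning in this comment; the variable names `full` and `empty` are illustrative:

```python
import pandas as pd

# Scalar q (default 0.5) returns a Series indexed by the frame's columns
# and named after the quantile.
full = pd.DataFrame({"A": [1]}).quantile()

# Dropping the only label leaves an empty Series that keeps the name.
empty = full.drop("A")

print(type(full).__name__, full.name)
print(len(empty), empty.name)
```

The empty Series is the shape one would expect from the scalar-q call when no numeric columns remain.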

gfyoung added the Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), Bug, and DataFrame (DataFrame data structure) labels on Jul 17, 2019
jreback added this to the 1.0 milestone on Jul 20, 2019
Contributor

jreback left a comment

LGTM except for a small comment; needs a whatsnew note (1.0)

# GH#23925 _get_numeric_data may drop all columns
df = pd.DataFrame(pd.date_range("1/1/18", periods=5))
result = df.quantile(0.5)
expected = pd.Series([], index=[], name=0.5)
Contributor

can you add a name to the index of the frame which I think should propagate here?

Member Author

good point. I think it should be df.columns.name that propagates.

It looks like DataFrame._get_numeric_data isn't retaining the columns' name; will open a separate issue.
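
A sketch of the propagation being discussed, assuming a pandas version that includes this fix. The name `"example_name"` is hypothetical, and `numeric_only=True` is passed explicitly so the datetime column is dropped regardless of the version's default:

```python
import pandas as pd

# Datetime-only frame, as in the test under review; no numeric columns.
df = pd.DataFrame(pd.date_range("1/1/18", periods=5))
df.columns.name = "example_name"  # hypothetical name, for illustration only

# Scalar q: an empty Series named after the quantile.
scalar_result = df.quantile(0.5, numeric_only=True)

# List-like q: a DataFrame indexed by the quantiles, with no columns.
list_result = df.quantile([0.5], numeric_only=True)

print(scalar_result.index.name)
print(list_result.columns.name)
```

In the scalar case the columns' name ends up on the result's index; in the list case it stays on the result's columns.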

Contributor

also I pushed the 1.0 whatsnew if you rebase

@@ -193,6 +193,7 @@ ExtensionArray
Other
^^^^^

- Bug in :meth:`DataFrame.quantile` with zero-column :class:`DataFrame` incorrectly raising (:issue:`23925`)
Contributor

move this to numeric & remove the Other section so it's not used anymore.

@@ -467,3 +467,17 @@ def test_quantile_empty(self):

# FIXME (gives NaNs instead of NaT in 0.18.1 or 0.19.0)
# res = df.quantile(0.5, numeric_only=False)

def test_quantile_empty_no_columns(self):
Contributor

might be worth testing this with timedelta & period data as well (just to confirm that they work), but can do followup.
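
One possible shape for such a follow-up check, as a sketch rather than the test that was eventually added. It assumes that timedelta and period columns are also excluded by `numeric_only=True`; the dictionary keys and variable names are illustrative:

```python
import pandas as pd

# Frames holding only timedelta or only period data (no numeric columns).
frames = {
    "timedelta": pd.DataFrame(pd.timedelta_range("1 day", periods=5)),
    "period": pd.DataFrame(pd.period_range("2018-01", periods=5, freq="M")),
}

results = {}
for label, frame in frames.items():
    # numeric_only=True is explicit so the behavior does not depend on the default.
    results[label] = frame.quantile(0.5, numeric_only=True)
    print(label, len(results[label]), results[label].name)
```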

jreback merged commit 01babb5 into pandas-dev:master on Jul 24, 2019
Contributor

jreback commented Jul 24, 2019

thanks @jbrockmendel

jbrockmendel deleted the quantile branch on July 24, 2019 13:20
quintusdias pushed a commit to quintusdias/pandas_dev that referenced this pull request Aug 16, 2019
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Bug DataFrame DataFrame data structure
Development

Successfully merging this pull request may close these issues.

DataFrame Quantile Broken with Datetime Data
4 participants